Abstract. A sequence of random variables, each taking values 0 or 1, is
called a Bernoulli sequence. We say that a string of length d occurs in a
Bernoulli sequence if a success is followed by exactly (d - 1) failures before
the next success. The counts of such d-strings are of interest, and in specific
independent Bernoulli sequences are known to correspond to asymptotic
d-cycle counts in random permutations.

In this note, we give a new framework, in terms of conditional Poisson 
processes, which allows for a quick characterization of the joint distribution of 
the counts of all d-strings, in a general class of Bernoulli sequences, as certain 
mixtures of the product of Poisson measures. In particular, this general class 
includes all Bernoulli sequences considered in the literature, as well as a host 
of new sequences. 



1. Introduction 

In this note, we study the joint distribution of the counts of certain d-strings
of all orders $d \ge 1$ arising in Bernoulli sequences. Previous work has used several
different methods, including combinatorial, factorial moment, and Pólya and Hoppe
urn model methods, to identify the joint count distribution with respect to a class
of independent Bernoulli sequences. In this context, our main contribution is to
introduce a new framework, using conditional Poisson processes, which allows for
a concise derivation of the joint count distribution as a mixture of the product of
Poisson measures with respect to all Bernoulli sequences considered before, as well
as many others in a general class, including some dependent Bernoulli sequences.

A Bernoulli sequence $Y = \{Y_n\}_{n \ge 1}$ is a sequence of $\{0, 1\}$-valued random
variables. For $d \ge 1$, we say that a d-string occurs if a 1 is followed by exactly $(d - 1)$
0's before the next 1 in the Bernoulli sequence. Specifically, a d-string occurs at
time $n \ge 1$ if $Y_{n,d} = 1$ where

$$Y_{n,d} = \begin{cases} Y_n Y_{n+1} & \text{for } d = 1, \\ Y_n (1 - Y_{n+1}) \cdots (1 - Y_{n+d-1}) Y_{n+d} & \text{for } d \ge 2, \end{cases}$$

that is, if $(Y_n, \ldots, Y_{n+d}) = (1, \underbrace{0, \ldots, 0}_{d-1}, 1)$.

Let $Z_d = \sum_{n \ge 1} Y_{n,d}$ be the count of all d-strings, for $d \ge 1$, and $Z = \{Z_d : d \ge 1\}$
be the "count vector" of strings. [In general, Z may have divergent components, but
for the Bernoulli sequences considered in this article it is easily shown (by taking
expectations) that all components $Z_k$ are finite with probability 1.]
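As an illustration of the definition, the following minimal Python sketch counts d-strings in a finite 0/1 list. The function name `count_strings` is ours and not part of the text; the sketch relies only on the observation that a d-string is a pair of consecutive 1's at distance d.

```python
from collections import Counter

def count_strings(y):
    """Return a Counter mapping d to the number of d-strings in the 0/1 list y."""
    ones = [i for i, v in enumerate(y) if v == 1]   # positions of the 1's
    # two consecutive 1's at distance d form exactly one d-string
    return Counter(j - i for i, j in zip(ones, ones[1:]))

y = [0, 1, 1, 0, 0, 1, 0, 1, 1]
counts = count_strings(y)   # gaps between consecutive 1's are 1, 3, 2, 1
```

Leading zeros before the first 1 and trailing zeros after the last 1 contribute nothing, in line with the definition.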

In this notation, the general problem is to understand the distribution of Z 
and its connection to the underlying sequence Y. Aside from the problem's basic 
interest, d-strings and their counts from specific independent Bernoulli sequences 
have interpretations with respect to random permutations, record values, Bayesian 
nonparametrics, and species allocation models through Ewens sampling formula. 

We will use "$\overset{d}{=}$" to signify "equals in distribution," and $\mathcal{L}(X)$ to denote the
law or distribution of the random variable X. Denote also $\mathrm{Po}(\lambda)$ as the Poisson
measure on $\mathbb{R}$ with intensity $\lambda$, and $I(B)$ as the indicator of a set B.

Example 1.1. Let $S_n = \{1, 2, \ldots, n\}$, and consider the Feller algorithm to generate
a permutation $\pi : S_n \to S_n$ uniformly among the n! choices (cf. Feller (1945)):

1. Draw an element uniformly from $S_n$, and call it $\pi(1)$. If $\pi(1) = 1$, a 1-cycle
is completed. If $\pi(1) \ne 1$, make another draw uniformly from $S_n \setminus \{\pi(1)\}$,
and call it $\pi(\pi(1))$. Continue drawing from $S_n \setminus \{\pi(1), \pi(\pi(1))\}, \ldots$, naming
them $\pi(\pi(\pi(1)))$, and so on, until a cycle (of some length) is finished.

2. From the elements left in $S_n \setminus \{\pi(1), \pi(\pi(1)), \ldots, 1\}$ after the first cycle is
completed, follow the process in step 1 with the smallest remaining number
taking the role of "1" to finish a second cycle. Repeat until all elements of
$S_n$ are exhausted.

Let $I_k^{(n)}$ be the indicator that a cycle is completed at the kth Feller draw from $S_n$.
A moment's thought shows that $\{I_k^{(n)}\}_{k=1}^{n}$ are independent Bernoulli random
variables with $P(I_k^{(n)} = 1) = 1/(n - k + 1)$ since, independent of the past, exactly one
choice at time $1 \le k \le n$ from the remaining $n - k + 1$ members left in $S_n$ completes
the cycle. Denote $C_k^{(n)}$ as the number of k-cycles in $\pi$,

$$C_k^{(n)} = \begin{cases} I_1^{(n)} + \sum_{i=1}^{n-1} I_i^{(n)} I_{i+1}^{(n)} & \text{for } k = 1, \\ \prod_{j=1}^{k-1} \big(1 - I_j^{(n)}\big) I_k^{(n)} + \sum_{i=1}^{n-k} I_i^{(n)} \prod_{j=i+1}^{i+k-1} \big(1 - I_j^{(n)}\big) I_{i+k}^{(n)} & \text{for } 2 \le k \le n. \end{cases}$$

Now let Y be the independent sequence where $P(Y_k = 1) = 1/k$ for $k \ge 1$, so
that $Y_k \overset{d}{=} I_{n-k+1}^{(n)}$ for $1 \le k \le n$. Then, as $Y_n$, and $Y_{n-k+1} \prod_{j=n-k+2}^{n} (1 - Y_j)$ for
$2 \le k \le n$, all vanish in probability as $n \uparrow \infty$, we conclude for each $k \ge 1$ that

$$\lim_{n \to \infty} C_k^{(n)} \overset{d}{=} Z_k.$$

Finally, as is well-known, the asymptotic cycle counts $\{\lim_n C_k^{(n)}\}_{k \ge 1}$ are
distributed as independent Poisson random variables with respective means 1/k for
$k \ge 1$ (cf. Kolchin (1971)). Hence, $Z \overset{d}{=} \prod_{k \ge 1} \mathrm{Po}(1/k)$. [Example 2.1 in section
2 gives a derivation in our Poisson process framework. See also Arratia-Barbour-
Tavaré (1992, 2003) for more discussion with the Ewens sampling formula.]
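The two steps of the Feller algorithm can be sketched in Python as follows. This is only an illustrative sketch: the function names and the use of `random.Random` are our own choices, and the code records, for each draw, whether that draw completed a cycle, so that cycle lengths can be read off as gaps between completion times.

```python
import random

def feller_permutation(n, rng):
    """Build a permutation of {1..n} by Feller's sequential draws.

    Returns (pi, indicators): pi maps i -> pi(i), and indicators[k-1] is 1
    when the k-th draw completes a cycle.
    """
    unused = set(range(1, n + 1))      # elements not yet drawn as an image
    pi, indicators = {}, []
    while unused:
        start = min(unused)            # smallest remaining element opens a cycle
        current = start
        while True:
            image = rng.choice(sorted(unused))   # uniform draw from what is left
            unused.remove(image)
            pi[current] = image
            indicators.append(1 if image == start else 0)
            if image == start:         # the cycle is closed
                break
            current = image
    return pi, indicators

def cycle_lengths_from_indicators(indicators):
    """Cycle lengths read off as gaps between successive completion times."""
    lengths, last = [], 0
    for k, flag in enumerate(indicators, start=1):
        if flag:
            lengths.append(k - last)
            last = k
    return lengths

def cycle_lengths(pi):
    """Cycle lengths computed directly from the permutation."""
    seen, lengths = set(), []
    for s in pi:
        if s in seen:
            continue
        length, t = 0, s
        while t not in seen:
            seen.add(t)
            t = pi[t]
            length += 1
        lengths.append(length)
    return lengths

pi, ind = feller_permutation(10, random.Random(1))
```

In this sketch the last draw always completes a cycle, which matches the completion probability 1/(n - k + 1) at k = n.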

Example 1.2. Consider the standard nonparametric problem of estimating the
unknown distribution function F from independent and identically distributed
observations $\{X_i\}_{i \ge 1}$. A Bayesian may place on F a Dirichlet prior with parameters
$a\mu$ where $a > 0$ and $\mu$ is a non-atomic probability measure.

Let $Y_1 = 1$ and for $n \ge 2$ define $Y_n = 1$ if $X_n$ is a new observation, that is if
$X_n \notin \{X_1, \ldots, X_{n-1}\}$, and $Y_n = 0$ otherwise. Then, it can be shown that Y is an
independent Bernoulli sequence with $P(Y_n = 1) = a/(a + n - 1)$ for $n \ge 1$ and that
$(\log n)^{-1} \sum_{i=1}^{n} Y_i \to a$ a.s. The latter result can be interpreted in terms of counts
of strings in this Bernoulli sequence. See Korwar-Hollander (1973) for more details,
and also Ghosh-Ramamoorthi (2003).



In the literature, to our knowledge, only the count vectors of the following class
of underlying independent Bernoulli sequences have been investigated. Denote the
independent Bernoulli sequence Y where $P(Y_n = 1) = a/(a + b + n - 1)$ for $n \ge 1$
as $Y = \mathrm{Bern}(a, b)$. The case $a = 1$, $b = 0$ is Example 1.1 (see also Arratia-
Tavaré (1992)). The case $a > 0$, $b = 0$ is Example 1.2. For this case, Arratia-
Barbour-Tavaré (1992) observe that the associated $Z \overset{d}{=} \prod_{k \ge 1} \mathrm{Po}(a/k)$ through
connections with the Ewens sampling formula. When $a = 1$, $b > 0$, Sethuraman-
Sethuraman (2004), employing factorial moments, show that, given the value $x_0$ of
a Beta(b, 1) random variable, $Z \overset{d}{=} \prod_{k \ge 1} \mathrm{Po}((1 - x_0^k)/k)$. Such a distribution will
be called a "mixture of independent Poisson factors." When $a > 0$ and $b > 0$,
Holst (2007) extends further, using Pólya and Hoppe urns, and establishes that,
given the value $x_0$ of a Beta(b, a) random variable, $Z \overset{d}{=} \prod_{k \ge 1} \mathrm{Po}(a(1 - x_0^k)/k)$,
again a mixture of independent Poisson factors. We note also that several
interesting studies of 1-strings preceded some of the above work, e.g. an unpublished
manuscript of Diaconis, Chern-Hwang-Yeh (2000), Móri (2001), Joffe-Marchand-
Perron-Popadiuk (2004), and the references in these and the above papers.

With this background, our main idea is that it is easier to study Z starting from
an extrinsic "conditional marked Poisson process model" (CMPP) rather than
directly from the Bernoulli sequence. Namely, we prove that when the underlying
Bernoulli sequence Y is generated through a CMPP model, the count vector Z is
distributed as a mixture of independent Poisson factors in terms of model
parameters (Theorem 2.2). As remarked earlier, the Poisson process techniques used here
are different from previous methods and allow quick derivations. Perhaps
interestingly, the sequences Y found in our model include many dependent Bernoulli
sequences (some explicit examples are in section 5). However, the most general
sequence studied till now, the independent sequence Bern(a, b) with $a > 0$ and $b > 0$,
can also be realized in our framework (Proposition 3.1), yielding a new proof of its
count vector distribution.

Our conditional marked Poisson process model also yields a new class of
independent Bernoulli sequences which we call $\mathrm{Bern}_1(a, b)$. Denote the independent
Bernoulli sequence Y where $P(Y_1 = 1) = 1$, and $P(Y_n = 1) = a/(a + b + n - 2)$ for
$n \ge 2$ as $Y = \mathrm{Bern}_1(a, b)$. The $\mathrm{Bern}_1(a, b)$ sequence places a 1 in front of the Bern(a, b)
sequence and picks up one more d-string contributed by any leading 0's in Bern(a, b).
We show that the distribution of the count vector Z for $\mathrm{Bern}_1(a, b)$ for $a > 0$, $b \ge 1$
is a mixture of independent Poisson factors (Proposition 4.1). This result fails for
$0 \le b < 1$, and in this case even the distribution of $Z_1$, the count of 1-strings in
$\mathrm{Bern}_1(a, b)$, is not a mixture of Poisson distributions (Proposition 4.5). However,
the distribution of Z in $\mathrm{Bern}_1(a, b)$ can be expressed through a recurrence relation
for all values of b including $0 \le b < 1$ (Proposition 4.3).

The plan of the article is to discuss the CMPP model, and prove the main
theorem in section 2. In sections 3 and 4, the main theorem is applied to the independent
sequences Bern(a, b) and $\mathrm{Bern}_1(a, b)$ respectively. Last, in section 5, two explicit
dependent Bernoulli sequences, arising from the CMPP model, are given.
2. CMPP models

The following "Poisson process" derivation of the distribution of Z with respect
to Bern(1, 0) (cf. Example 1.1) motivates subsequent development.

Example 2.1. Consider the following standard way to generate a Bern(1, 0)
sequence. Let $\{\beta_i\}_{i \ge 1}$ be independent, identically distributed (iid) Uniform[0, 1]
random variables, and define $Y_n = I(\beta_n \text{ is a record})$, $n \ge 1$. Rényi's theorem shows
that $\{Y_n\}_{n \ge 1}$ are independent and $P(Y_n = 1) = 1/n$ for $n \ge 1$, that is Y =
Bern(1, 0). Let $\{X_i\}_{i \ge 1}$ be the record values among $\{\beta_i\}_{i \ge 1}$. Notice that the point
process N on [0, 1] defined by $N(A) = \sum_{i \ge 1} \delta_{X_i}(A)$ is a nonhomogeneous Poisson
process on [0, 1] with intensity $1/(1 - x)$ (cf. Resnick (1994)). For each point $X_i$, we
can associate a Geometric$(1 - X_i)$ variable $L_i$ (a "mark") corresponding to the
number of uniform random variables in $\{\beta_i\}_{i \ge 1}$ to the next record. Then, by thinning
decompositions, $Z_k = \sum_{i \ge 1} I(L_i = k) = \sum_{i \ge 1} \delta_{X_i}([0, 1]) I(L_i = k)$ for $k \ge 1$ are
independent Poisson variables with respective means $\int_0^1 \frac{1}{1 - x}\, x^{k-1} (1 - x)\, dx = 1/k$
for $k \ge 1$.
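The record construction in Example 2.1 can be illustrated on a finite sample. The following Python sketch uses helper names of our own choosing and extracts the record indicators, the record values, and the marks (numbers of draws from one record to the next); the final record of a finite sample has no observed mark and is therefore left out of the list of marks.

```python
def record_indicators(values):
    """Y_n = 1 if values[n-1] exceeds all earlier values (a record), else 0."""
    y, best = [], float("-inf")
    for v in values:
        is_record = v > best
        y.append(1 if is_record else 0)
        if is_record:
            best = v
    return y

def records_and_marks(values):
    """Record values and the number of draws from each record to the next one."""
    y = record_indicators(values)
    times = [n for n, flag in enumerate(y, start=1) if flag]
    records = [values[n - 1] for n in times]
    marks = [t2 - t1 for t1, t2 in zip(times, times[1:])]
    return records, marks

betas = [0.3, 0.1, 0.5, 0.4, 0.2, 0.9]
y = record_indicators(betas)              # records occur at draws 1, 3 and 6
records, marks = records_and_marks(betas)
```

Each mark equal to k corresponds to one k-string in the indicator sequence, which is the identity used in the thinning argument.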

In a sense, the thrust of the following CMPP model and our main result
(Theorem 2.2) below is to reverse the procedure in Example 2.1. By beginning with a
given Poisson process and spacing variables, which themselves determine the count
vector Z, we then see what associated Bernoulli sequence Y arises.

Consider a sequence of random variables $(X, L) = \{(X_i, L_i)\}_{i \ge 0}$ on $\mathbb{R} \times \mathbb{N}$ where
$\mathbb{N} = \{1, 2, \ldots\}$, and the point process N on $\mathbb{R}$ given by $N(A) = \sum_{i \ge 1} \delta_{X_i}(A)$. Let
also $g : \mathbb{R} \to [0, \infty)$ be a probability density function (pdf), and for each $x \in \mathbb{R}$ let
$r(x, \cdot), q(x, \cdot) : \mathbb{N} \to [0, 1]$ be probability mass functions, and $\lambda_x : \mathbb{R} \to [0, \infty)$ be an
intensity function.

Then, we say (X, L) is the conditional marked Poisson process $\mathcal{M}(g, r, \lambda, q)$ if
the following hold:

1. $X_0$ has pdf g,

2. conditional on $X_0 = x_0$, N is a nonhomogeneous Poisson process with
intensity function $\lambda_{x_0}(\cdot)$,

3. $P(L_0 = k \mid X) = r(X_0, k)$ for $k \ge 1$, and

4. $P(L_n = k \mid X, L_0, L_1, \ldots, L_{n-1}) = q(X_n, k)$ for $k, n \ge 1$.

Let $L_0^* = L_0$, and $L_r^* = L_{r-1}^* + L_r$ for $r \ge 1$. We now define a Bernoulli sequence
Y based on (X, L) as follows: $Y_n = 1$ if n is of the form $L_r^*$ for some $r \ge 0$, and
$Y_n = 0$ otherwise. Another way to say this is

$$Y_n = \begin{cases} 0 & \text{when } n < L_0^*, \text{ or } L_r^* < n < L_{r+1}^* \text{ for } r \ge 0, \\ 1 & \text{when } n = L_r^* \text{ for } r \ge 0. \end{cases} \tag{2.1}$$

Then, the count vector Z is given by

$$Z = \{Z_k : k \ge 1\}, \qquad Z_k = \sum_{n \ge 1} I(L_n = k). \tag{2.2}$$

We note the zeroth mark $L_0$ is not included in the above summation since any $Y_i$
with $i < L_0$ is part of an initial segment of zeros of the sequence not preceded by
a 1, and so does not contribute to any d-string, for $d \ge 1$.
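The passage from marks to the Bernoulli sequence and to the count vector can be sketched in Python. The helper names below are ours, and the marks are supplied as a finite list starting with the zeroth mark; the sketch places the 1's at the partial sums of the marks and counts the marks after the zeroth one.

```python
from collections import Counter
from itertools import accumulate

def sequence_from_marks(marks):
    """Build the 0/1 sequence whose 1's sit at the partial sums of the marks."""
    one_times = set(accumulate(marks))          # partial sums of L_0, L_1, ...
    return [1 if t in one_times else 0 for t in range(1, max(one_times) + 1)]

def count_vector_from_marks(marks):
    """Z_k = number of marks equal to k, skipping the zeroth mark."""
    return Counter(marks[1:])

marks = [3, 1, 2, 2]                 # L_0 = 3, L_1 = 1, L_2 = 2, L_3 = 2
y = sequence_from_marks(marks)       # 1's at times 3, 4, 6, 8
z = count_vector_from_marks(marks)   # one 1-string and two 2-strings
```

Reading the gaps between consecutive 1's in `y` gives the same counts as `z`, which is why the zeroth mark plays no role in the count vector.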

Theorem 2.2. Suppose $\int \lambda_w(x) q(x, k)\, dx < \infty$ for all $w \in \mathbb{R}$ and $k \ge 1$. Then,
the count vector Z associated with sequence Y, defined through CMPP
$(X, L) = \mathcal{M}(g, r, \lambda, q)$, is distributed as follows. Given the value $X_0 = x_0$,

$$Z \overset{d}{=} \prod_{k \ge 1} \mathrm{Po}\Big( \int \lambda_{x_0}(x) q(x, k)\, dx \Big).$$

Remark 2.3. The distribution of Z does not depend on the transition function r,
consistent with the discussion of $L_0$ before the theorem.

Also, for a given $k \ge 1$, $Z_k$ is infinite with positive probability exactly when
there is a set B such that $P(X_0 \in B) > 0$ and $\int \lambda_w(x) q(x, k)\, dx = \infty$ for $w \in B$.

Proof of Theorem 2.2. Recall the count vector representation (2.2). Conditional
on $X_0 = x_0$, the point process M on $\mathbb{R} \times \mathbb{N}$ given by
$M(A \times \{k\}) = \sum_{i \ge 1} \delta_{X_i}(A) I(L_i = k)$ is a Poisson process on $\mathbb{R} \times \mathbb{N}$ with intensity function
$\lambda_{x_0}(x) q(x, k)$ (cf. Proposition 4.10.1 (b) Resnick (1994)). Hence, it follows that,
given $X_0 = x_0$, the variables $M(\mathbb{R} \times \{k\}) = \sum_{n \ge 1} I(L_n = k) = Z_k$ are independent
Poisson variables with respective means $\int \lambda_{x_0}(x) q(x, k)\, dx$, for $k \ge 1$. ■

3. The sequence Bern(a, b)

We now derive the count vector distribution for the sequence Bern(a, b) using a
CMPP model. Denote, as usual, for $\alpha, \beta > 0$, the Beta function

$$B(\alpha, \beta) = \frac{\Gamma(\alpha) \Gamma(\beta)}{\Gamma(\alpha + \beta)} = \int_0^1 x^{\alpha - 1} (1 - x)^{\beta - 1}\, dx \tag{3.1}$$

and let

1. $g(x) = x^{b-1} (1 - x)^{a-1} / B(b, a)$ on $0 < x < 1$, the Beta(b, a) pdf,

2. $r(x, k) = x^{k-1} (1 - x)$ for $k \ge 1$,

3. $\lambda_w(x) = [a/(1 - x)] I(w < x < 1)$, and

4. $q(x, k) = x^{k-1} (1 - x)$ for $k \ge 1$.

Proposition 3.1. The model $(X, L) = \mathcal{M}(g, r, \lambda, q)$ produces an independent
Bernoulli sequence Y = Bern(a, b) for $a > 0$ and $b > 0$ whose count vector
Z, conditional on the value $x_0$ of a Beta(b, a) random variable, is distributed as
$\prod_{k \ge 1} \mathrm{Po}(a(1 - x_0^k)/k)$.
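A first-moment consistency check of Proposition 3.1 can be sketched in Python. Averaging the Poisson mean over the Beta(b, a) variable should give the same value as summing the probabilities of a k-string at each time n in the independent Bern(a, b) sequence. The function names are ours, the Beta moment formula used in `mixture_mean` is the standard one, and the direct sum is truncated at a large `n_max`, so agreement is only approximate.

```python
from math import prod

def mixture_mean(a, b, k):
    """E[a (1 - x0^k) / k] for x0 ~ Beta(b, a)."""
    moment = prod((b + j) / (a + b + j) for j in range(k))   # E[x0^k]
    return a * (1.0 - moment) / k

def direct_mean(a, b, k, n_max=200_000):
    """Truncated sum over n of P(k-string at time n) for independent Bern(a, b)."""
    def p(n):
        return a / (a + b + n - 1)            # P(Y_n = 1)
    total = 0.0
    for n in range(1, n_max + 1):
        term = p(n) * p(n + k)                # 1 at time n and at time n + k
        for j in range(1, k):
            term *= 1.0 - p(n + j)            # zeros strictly in between
        total += term
    return total

m_mix = mixture_mean(1.0, 1.0, 2)     # equals 1/3 for a = b = 1, k = 2
m_dir = direct_mean(1.0, 1.0, 2)      # approaches the same value as n_max grows
```

This only compares expectations; it does not test the full Poisson mixture law.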

Remark 3.2. As a corollary, by taking $b \downarrow 0$, we recover the count vector
distribution for Bern(a, 0) already considered in the literature as simply $Z \overset{d}{=} \prod_{k \ge 1} \mathrm{Po}(a/k)$.
Note that $(X_0, L_0) \to (0, 1)$ in distribution as $b \downarrow 0$.

The Poisson process in the above CMPP model with intensity $\lambda_w(\cdot)$ can be
generated in the following way. First, the point process formed by the record
values from an iid sequence of Beta(1, a) random variables is a Poisson process
with intensity $a/(1 - x)$, the Beta(1, a) failure rate (cf. Resnick (1994) Proposition
4.11.1 (b)). Next, we thin this process as follows. Let $X_0 \overset{d}{=}$ Beta(b, a), and $\{X_i\}_{i \ge 1}$
be the record values from an iid sequence of Beta(1, a) random variables, subject
to $X_i > X_0$ for $i \ge 1$. Then, conditional on $X_0 = x_0$, the point process N defined
by $N(A) = \sum_{i \ge 1} \delta_{X_i}(A)$ is the desired Poisson process with intensity function
$\lambda_{x_0}(x) = [a/(1 - x)] I(x_0 < x < 1)$.

Proof of Proposition 3.1. The second part on the count vector distribution follows
from Theorem 2.2, noting for $k \ge 1$ that

$$\int \lambda_{x_0}(x) q(x, k)\, dx = \int_{x_0}^{1} a x^{k-1}\, dx = \frac{a(1 - x_0^k)}{k}. \tag{3.2}$$

For the first part, we observe that the distribution of $\{Y_i\}_{i \ge 1}$ given through (2.1)
is uniquely determined by the probabilities of cylinder sets of the form

$$\begin{aligned} E(k_0, \ldots, k_n) &= (L_0 = k_0, L_1 = k_1, \ldots, L_n = k_n) \\ &= (Y_t = 1 \text{ for } t \in \{K_0, K_1, \ldots, K_n\}, \text{ and } Y_t = 0 \text{ otherwise for } 1 \le t \le K_n) \end{aligned} \tag{3.3}$$

where $k_0, k_1, \ldots, k_n$ are positive integers and $K_0 = k_0$, $K_1 = K_0 + k_1, \ldots, K_n = K_{n-1} + k_n$
are their partial sums. If the probability of sets of the form
$E \overset{\mathrm{def}}{=} E(k_0, \ldots, k_n)$ is a product of appropriate marginal probabilities then $\{Y_n, n \ge 1\}$
will be the Bernoulli sequence Bern(a, b). We will proceed to establish this.

Let $A_n = \{0 < x_0 < x_1 < \cdots < x_n < 1\}$. Using the Beta variables representation
in Remark 3.2, write

$$P(E) = \int_{A_n} g(x_0) r(x_0, k_0) \prod_{i=1}^{n} P(X_i \in dx_i \mid X_i > X_{i-1})\, q(x_i, k_i)\, dx_0.$$

Since $P(X_i \in dx_i \mid X_i > X_{i-1}) = a (1 - x_i)^{a-1} / (1 - x_{i-1})^{a}\, dx_i$ for $1 \le i \le n$, we have
further that the last line equals

$$\frac{a^n}{B(b, a)} \int_{A_n} x_0^{b + k_0 - 2} \prod_{i=1}^{n} x_i^{k_i - 1}\, (1 - x_n)^{a}\, dx_0 \cdots dx_n \tag{3.4}$$

and, noting (3.1) and $a\Gamma(a) = \Gamma(a + 1)$, that (3.4) becomes

$$\frac{B(b + K_n - 1, a + 1)}{B(b, a)} \cdot \frac{a^n}{\prod_{i=0}^{n-1} (b + K_i - 1)} = \frac{a^{n+1} \prod_{l=0}^{K_n - 2} (b + l)}{\prod_{l=0}^{K_n - 1} (a + b + l) \prod_{i=0}^{n-1} (b + K_i - 1)} = \prod_{l=1}^{K_n} \frac{b + l - 1}{a + b + l - 1} \prod_{i=0}^{n} \frac{a}{b + K_i - 1}$$

which is exactly $\prod_{l=1}^{K_n} P(Y_l = 0) \prod_{i=0}^{n} [P(Y_{K_i} = 1) / P(Y_{K_i} = 0)]$ with Y specified
as Bern(a, b). ■
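The final identity of the proof can be checked in exact rational arithmetic. The Python sketch below compares the cylinder probability computed directly for the independent Bern(a, b) sequence with the closed form $a^{n+1} \prod_{l=0}^{K_n-2}(b+l) \big/ \big[\prod_{l=0}^{K_n-1}(a+b+l) \prod_{i=0}^{n-1}(b+K_i-1)\big]$ obtained by carrying out the integral over $A_n$. The function names are ours, and the closed form is our reading of the displayed expression, so this is a sanity check under that assumption rather than part of the proof.

```python
from fractions import Fraction
from itertools import accumulate
from math import prod

def cylinder_direct(a, b, ks):
    """P(E(k_0..k_n)) for independent Y with P(Y_t = 1) = a/(a+b+t-1)."""
    ones = set(accumulate(ks))                 # partial sums K_0, ..., K_n
    p = Fraction(1)
    for t in range(1, max(ones) + 1):
        p_one = a / (a + b + t - 1)
        p *= p_one if t in ones else 1 - p_one
    return p

def cylinder_closed_form(a, b, ks):
    """Closed form obtained after integrating over A_n and expanding the Beta ratio."""
    K = list(accumulate(ks))
    n = len(ks) - 1
    num = a ** (n + 1) * prod(b + l for l in range(K[-1] - 1))
    den = prod(a + b + l for l in range(K[-1])) * prod(b + K[i] - 1 for i in range(n))
    return num / den

a, b = Fraction(3, 2), Fraction(1, 2)
lhs = cylinder_direct(a, b, [2, 1, 3])
rhs = cylinder_closed_form(a, b, [2, 1, 3])    # both equal 27/5120
```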

4. The sequence Bern_1(a, b)

We will derive the count vector distribution for the sequence $\mathrm{Bern}_1(a, b)$, and
show a dichotomy depending on whether $b \ge 1$ or $b < 1$. We first consider the case
where $a > 0$ and $b > 1$. Define

1. $g^*(x) = x^{b-2} (1 - x)^{a} / B(b - 1, a + 1)$ on $0 < x < 1$, the Beta(b - 1, a + 1) pdf,

2. $r^*(x, 1) = 1$,

3. $\lambda^*_w(x) = [a/(1 - x)] I(w < x < 1)$, and

4. $q^*(x, k) = x^{k-1} (1 - x)$ for $k \ge 1$.

Proposition 4.1. The CMPP model $(X, L) = \mathcal{M}(g^*, r^*, \lambda^*, q^*)$ produces an
independent Bernoulli sequence $Y = \mathrm{Bern}_1(a, b)$ for $a > 0$ and $b > 1$, and, conditional
on a Beta(b - 1, a + 1) variable $X_0 = x_0$, the distribution of its count vector Z is
$\prod_{k \ge 1} \mathrm{Po}(a(1 - x_0^k)/k)$.

Remark 4.2. As a corollary, by taking $b \downarrow 1$, we find the count vector distribution
for $\mathrm{Bern}_1(a, 1)$ to be simply $Z \overset{d}{=} \prod_{k \ge 1} \mathrm{Po}(a/k)$. [In fact, $\mathrm{Bern}_1(a, 1)$ coincides with
the sequence Bern(a, 0) mentioned earlier in Remark 3.2.]

Also, we note the Poisson process in the above CMPP model with intensity
$\lambda^*$ can be generated, as in Proposition 3.1, by taking $X_0 \overset{d}{=}$ Beta(b - 1, a + 1),
and $\{X_i\}_{i \ge 1}$ as the sequence of records from an iid sequence of Beta(1, a) random
variables, subject to the condition $X_i > X_0$.

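Proof of Proposition 4.1. We need only establish the distribution of Y, as the last
statement follows from Theorem 2.2 and the computation (3.2). The calculations
are similar to the proof of Proposition 3.1. Let $k_0 = 1, k_1, k_2, \ldots, k_n$ be positive
integers, and $K_0 = k_0 = 1$, $K_1 = K_0 + k_1, \ldots, K_n = K_{n-1} + k_n$ be their partial
sums. Recall the cylinder set defined in (3.3), and let

$$E_1 \overset{\mathrm{def}}{=} E(1, k_1, \ldots, k_n) = (L_0 = 1, L_1 = k_1, \ldots, L_n = k_n),$$

and set $A_n = \{0 < x_0 < x_1 < \cdots < x_n < 1\}$. Write, using the construction in
Remark 4.2, that

$$\begin{aligned} P(E_1) &= \int_{A_n} g^*(x_0) r^*(x_0, 1) \prod_{i=1}^{n} P(X_i \in dx_i \mid X_i > X_{i-1})\, q^*(x_i, k_i)\, dx_0 \\ &= \frac{a^n}{B(b - 1, a + 1)} \int_{A_n} x_0^{b-2} \prod_{i=1}^{n} x_i^{k_i - 1}\, (1 - x_n)^{a}\, dx_0 \cdots dx_n. \end{aligned}$$

Then, with (3.1) and $a\Gamma(a) = \Gamma(a + 1)$, the last line equals

$$\frac{B(b + K_n - 2, a + 1)}{B(b - 1, a + 1)} \cdot \frac{a^n}{(b - 1) \prod_{i=1}^{n-1} (b + K_i - 2)} = \frac{\prod_{r=0}^{K_n - 2} (b - 1 + r)}{\prod_{r=0}^{K_n - 2} (a + b + r)} \cdot \frac{a^n}{(b - 1) \prod_{i=1}^{n-1} (b + K_i - 2)} = \prod_{l=2}^{K_n} \frac{b + l - 2}{a + b + l - 2} \prod_{i=1}^{n} \frac{a}{b + K_i - 2}$$

which is exactly $P(Y_1 = 1) \prod_{l=2}^{K_n} P(Y_l = 0) \prod_{i=1}^{n} [P(Y_{K_i} = 1) / P(Y_{K_i} = 0)]$ with Y
specified as $\mathrm{Bern}_1(a, b)$. ■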

We now give the distribution of the count vector under $\mathrm{Bern}_1(a, b)$ for all $a > 0$
and $b \ge 0$ by conditioning on the location of the second 1 in the sequence Y. Denote
Z(a, b) as the count vector with respect to $\mathrm{Bern}_1(a, b)$ for $a > 0$ and $b \ge 0$. Let $W_n$
be the sequence whose nth co-ordinate is 1 and all the other co-ordinates are zero,
for $n \ge 1$. Let also

$$p_n = \begin{cases} \dfrac{a}{a + b} & \text{for } n = 2, \\[2mm] \dfrac{a}{a + b + n - 2} \displaystyle\prod_{r=0}^{n-3} \frac{b + r}{a + b + r} & \text{for } n \ge 3 \end{cases}$$

be the probability that the second 1 in $\mathrm{Bern}_1(a, b)$ occurs at time $n \ge 2$, and note
that $\sum_{n \ge 2} p_n = 1$.

Proposition 4.3. For $a > 0$ and $b \ge 0$, we have

$$\mathcal{L}(Z(a, b)) = \sum_{n \ge 2} p_n\, \mathcal{L}\big(Z(a, b + n - 1) + W_{n-1}\big), \tag{4.1}$$

and $Z(a, b + n - 1)$, conditional on the value $x_0$ of a Beta(b + n - 2, a + 1) random
variable, is distributed as $\prod_{k \ge 1} \mathrm{Po}(a(1 - x_0^k)/k)$, for $b > 0$ and $n \ge 2$.

Remark 4.4. The special case b = 0 is interesting. The sequence $\mathrm{Bern}_1(a, 0)$ is
the independent sequence where $Y_1 = Y_2 = 1$ and $P(Y_n = 1) = a/(a + n - 2)$ for
$n \ge 3$. That is, starting from time n = 2, the sequence is $\mathrm{Bern}_1(a, 1) = \mathrm{Bern}(a, 0)$.
Hence, by Proposition 3.1 (see Remark 3.2), Z(a, 0) is distributed as $Z + W_1$ where
$Z \overset{d}{=} \prod_{k \ge 1} \mathrm{Po}(a/k)$ is the count vector for Bern(a, 0). This agrees with (4.1), since
$p_2 = 1$ (when b = 0) and $Z(a, 1) \overset{d}{=} Z$.

Proof of Proposition 4.3. The distribution of Z(a, b) follows by conditioning on
the first time that $Y_n = 1$ for $n \ge 2$. The distributions of $Z(a, b + n - 1)$ are
completely specified by Proposition 4.1 and Remark 4.2 since $b + n - 1 \ge 1$ for
$n \ge 2$. ■
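The weights in the recurrence are easy to compute numerically. The Python sketch below uses names of our own; it evaluates the probability that the second 1 of the sequence falls at time n, checks that these probabilities nearly sum to 1 over a long but finite range, and compares first moments of the two sides of the recurrence for the count of 1-strings, using the mean a(a + 1)/(a + b) of that count. The sums are truncated, so the agreement is approximate.

```python
def second_one_probs(a, b, n_max):
    """Probability, for n = 2..n_max, that the second 1 of the sequence is at time n."""
    probs = {}
    no_one_yet = 1.0                       # P(Y_2 = ... = Y_{n-1} = 0)
    for n in range(2, n_max + 1):
        p_one = a / (a + b + n - 2)        # P(Y_n = 1) for n >= 2
        probs[n] = no_one_yet * p_one
        no_one_yet *= 1.0 - p_one
    return probs

def mean_z1(a, b):
    """Mean count of 1-strings: a/(a+b) + a^2/(a+b) = a(a+1)/(a+b)."""
    return a * (a + 1) / (a + b)

a, b = 2.0, 0.5
p = second_one_probs(a, b, 5000)
total = sum(p.values())                    # close to 1
# the shift W_{n-1} adds one 1-string only when n = 2
rhs = p[2] + sum(p[n] * mean_z1(a, b + n - 1) for n in p)
lhs = mean_z1(a, b)                        # 2.4 for a = 2, b = 0.5
```

For b = 0 the first weight equals 1, in agreement with the special case discussed in Remark 4.4.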

From (4.1), it is not clear whether the distribution of Z(a, b) is a mixture of
product Poisson factors or not for $0 \le b < 1$. We show now that even the first
component $Z_1(a, b)$ is not a mixture of Poissons when $0 \le b < 1$.

Proposition 4.5. The distribution of $Z_1 = Z_1(a, b)$, the count of 1-strings in the
$\mathrm{Bern}_1(a, b)$ sequence, is not a mixture of Poissons when $0 \le b < 1$, that is, there is
no measure $\mu$ on $[0, \infty)$ such that

$$E[\exp\{t Z_1\}] = \int_{[0, \infty)} e^{v(e^t - 1)}\, d\mu(v). \tag{4.2}$$

Proof. It is well known that when (4.2) holds, the variable $Z_1$ is over-dispersed,
that is $\theta(Z_1) \overset{\mathrm{def}}{=} \mathrm{Var}(Z_1) - E(Z_1) \ge 0$. The proof now follows by the expression
for $\theta(Z_1)$ in (4.4) below. Let $Y = \mathrm{Bern}_1(a, b)$. Then,

$$Z_1 = Y_2 + \tilde{Z}_1 = Y_2 + Y_2 Y_3 + Z_1^{+} \tag{4.3}$$

where $\tilde{Z}_1 = \sum_{i \ge 2} Y_i Y_{i+1}$ and $Z_1^{+} = \sum_{i \ge 3} Y_i Y_{i+1}$; the latter is independent
of $Y_2$. Furthermore $\tilde{Z}_1$, $Z_1^{+}$ are the counts of strings of order 1 from Bern(a, b),
Bern(a, b + 1), respectively, and their distributions are known from Proposition 3.1.
Hence, by easy calculations

$$E(\tilde{Z}_1) = \frac{a^2}{a + b}, \qquad E(Z_1^{+}) = \frac{a^2}{a + b + 1}, \qquad E(\tilde{Z}_1^2) = \frac{a^3 (a + 1)}{(a + b)(a + b + 1)} + \frac{a^2}{a + b}.$$

From the identities in (4.3), we have

$$E(Z_1) = \frac{a(a + 1)}{a + b}, \qquad E(Z_1^2) = \frac{a(a + 1)}{a + b} + \frac{a^2 (a + 1)(a + 2)}{(a + b)(a + b + 1)}.$$

This leads to

$$\theta(Z_1) = \frac{a^2 (a + 1)(b - 1)}{(a + b)^2 (a + b + 1)} \tag{4.4}$$

which is negative for $b < 1$, and positive for $b > 1$. ■
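The last algebraic step can be verified in exact arithmetic. The Python sketch below, with function names of our own, computes the dispersion index from the first two moments of the count of 1-strings and compares it with the closed form $a^2(a+1)(b-1)/[(a+b)^2(a+b+1)]$; it checks the algebra only and assumes the moment formulas stated in the proof.

```python
from fractions import Fraction

def theta_from_moments(a, b):
    """Var(Z_1) - E(Z_1) computed from the first two moments of Z_1."""
    mean = a * (a + 1) / (a + b)
    second = mean + a**2 * (a + 1) * (a + 2) / ((a + b) * (a + b + 1))
    return second - mean**2 - mean

def theta_closed_form(a, b):
    """Closed form a^2 (a+1) (b-1) / ((a+b)^2 (a+b+1))."""
    return a**2 * (a + 1) * (b - 1) / ((a + b) ** 2 * (a + b + 1))

a = Fraction(2)
under = theta_from_moments(a, Fraction(1, 2))   # -48/175, under-dispersed
over = theta_from_moments(a, Fraction(3))       # positive, over-dispersed
```

The sign change at b = 1 is what separates the two regimes of the dichotomy.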



5. Some dependent Bernoulli sequences

Two examples of dependent Bernoulli sequences, arising in CMPP models with
simple structures, whose count vector distributions are mixtures of independent
Poisson factors are given.

First Sequence. For $a > 0$ and $b > 0$, denote $P_{a,b}$ as the probability
distribution of the CMPP $\mathcal{M}(g, r, \lambda, q)$ described in Proposition 3.1, which gives rise to
the Bernoulli sequence Bern(a, b). Let now $r_+(x, k) = k x^{k-1} (1 - x)^2$ for $k \ge 1$.
Consider the associated CMPP model $\mathcal{M}(g, r_+, \lambda, q)$ with $g, \lambda, q$ the same as in
Proposition 3.1. Denote the probability measure under this model as $P_+ = P^{+}_{a,b}$.
Note that $r_+(x, k) = k[r(x, k) - r(x, k + 1)]$ where $r(x, k) = x^{k-1} (1 - x)$. Recall
the cylinder set $E = E(k_0, \ldots, k_n)$ from (3.3) where $k_0, k_1, \ldots, k_n$ are positive
integers, and $K_0, K_1, \ldots, K_n$ their partial sums. It is easy to see that

$$P_+(E) = k_0 \big[ P_{a,b}\big(E(k_0, \ldots, k_n)\big) - P_{a,b}\big(E(k_0 + 1, k_1, \ldots, k_n)\big) \big].$$

From this expression, the distribution of Y can be recovered, and shown to be not
that of independent Bernoulli variables. For instance,

$$P_+(Y_1 = 1) = P_{a,b}(Y_1 = 1) - P_{a,b}(Y_1 = 0, Y_2 = 1) = \frac{a(a + 1)}{(a + b)(a + b + 1)},$$

and analogously

$$P_+(Y_2 = 1) = \frac{a^2 (a + 2) + 2ba(a + 1)}{(a + b)(a + b + 1)(a + b + 2)}.$$

Thus

$$P_+(Y_1 = 1) P_+(Y_2 = 1) = \frac{a^2 (a + 1)(a^2 + 2a + 2ba + 2b)}{(a + b)^2 (a + b + 1)^2 (a + b + 2)},$$

which does not match

$$P_+(Y_1 = 1, Y_2 = 1) = \frac{a^2 (a + 2)}{(a + b)(a + b + 1)(a + b + 2)}$$

for $a, b > 0$.

Finally, by Remark 2.3, we note the count vectors under $P_{a,b}$ and $P_+$ have
the same distribution, and by Proposition 3.1, conditional on the value $x_0$ of a
Beta(b, a) variable, the count vectors are distributed as $\prod_{k \ge 1} \mathrm{Po}(a(1 - x_0^k)/k)$.
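The failure of independence in the first sequence can be checked exactly for particular parameter values. The Python sketch below uses names of our own; it evaluates cylinder probabilities of the modified model through the difference formula for cylinder sets, using cylinder probabilities of the independent Bern(a, b) sequence, and then compares the product of the two marginals with the joint probability for a = b = 1.

```python
from fractions import Fraction
from itertools import accumulate

def p_cyl(a, b, ks):
    """Cylinder probability for independent Y with P(Y_t = 1) = a/(a+b+t-1)."""
    ones = set(accumulate(ks))
    p = Fraction(1)
    for t in range(1, max(ones) + 1):
        p_one = a / (a + b + t - 1)
        p *= p_one if t in ones else 1 - p_one
    return p

def p_plus_cyl(a, b, ks):
    """Modified model: k_0 times the difference of two Bern(a, b) cylinder probabilities."""
    k0, rest = ks[0], list(ks[1:])
    return k0 * (p_cyl(a, b, [k0] + rest) - p_cyl(a, b, [k0 + 1] + rest))

a, b = Fraction(1), Fraction(1)
p_y1 = p_plus_cyl(a, b, [1])                                # P_+(Y_1 = 1) = 1/3
p_y2 = p_plus_cyl(a, b, [2]) + p_plus_cyl(a, b, [1, 1])     # P_+(Y_2 = 1) = 7/24
p_joint = p_plus_cyl(a, b, [1, 1])                          # P_+(Y_1 = 1, Y_2 = 1) = 1/8
```

Here the product 7/72 differs from the joint probability 1/8, so the first two coordinates are dependent.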

Second Sequence. Consider $P_{1,0}$, the measure for the CMPP model discussed
in Example 2.1 and Remark 3.2 with respect to the Bernoulli sequence Bern(1, 0),
where $(X_0, L_0) = (0, 1)$, $\{X_i\}_{i \ge 1}$ are the records from an iid Uniform[0, 1] sequence,
and $L_i$ are Geometric$(1 - X_i)$ for $i \ge 1$.

Let P' stand for the measure under the "switched" CMPP model where $(X_1, L_1)$
and $(X_2, L_2)$ are interchanged. The probabilities of Y on cylinder sets (cf. (3.3))
under P' are given by

$$\begin{aligned} P'(E(1, k_1, \ldots, k_n)) &= P'(L_1 = k_1, \ldots, L_n = k_n) \\ &= P_{1,0}(L_2 = k_1, L_1 = k_2, \text{ and } L_i = k_i \text{ for } 3 \le i \le n) \end{aligned}$$

for positive integers $k_0 = 1, k_1, \ldots, k_n$ with $K_0 = 1$, $K_1 = K_0 + k_1, \ldots, K_n = K_{n-1} + k_n$
as their partial sums. Under both models $P_{1,0}$ and P', as only two
terms $(L_1, L_2)$ exchange places, the associated count vectors are the same, and by
Proposition 3.1 distributed as $\prod_{k \ge 1} \mathrm{Po}(1/k)$.

We now show that $\{Y_i\}_{i \ge 1}$ is not an independent sequence under P'. From the
calculation in (3.4) with $(X_0, L_0) = (0, 1)$, $Y_1 = 1$ and $r(x, 1) = 1$ (take $b \downarrow 0$), we have

$$P'(Y_2 = 1) = P_{1,0}(L_2 = 1) = \sum_{k \ge 1} P_{1,0}(L_1 = k, L_2 = 1) = \sum_{k \ge 1} \int_{0 < x_1 < x_2 < 1} x_1^{k-1} (1 - x_2)\, dx_1 dx_2 = 1/4.$$

Also,

$$P'(Y_2 = 1, Y_3 = 1) = P'(L_1 = 1, L_2 = 1) = P_{1,0}(L_2 = 1, L_1 = 1) = 1/6,$$

$$P'(Y_2 = 0, Y_3 = 1) = P_{1,0}(L_2 = 2) = \sum_{k \ge 1} \int_{0 < x_1 < x_2 < 1} x_1^{k-1} x_2 (1 - x_2)\, dx_1 dx_2 = 5/36,$$

which give $P'(Y_3 = 1) = 11/36$. However, $P'(Y_2 = 1) \cdot P'(Y_3 = 1) = 11/144 \ne 1/6 = P'(Y_2 = 1, Y_3 = 1)$.
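The numerical values in the second example can be checked by evaluating the double integrals term by term. Integrating first over the smaller variable and then over the larger one turns the two sums into $\sum_{k \ge 1} 1/(k(k+1)(k+2))$ and $\sum_{k \ge 1} 1/(k(k+2)(k+3))$; this reduction is our own elementary computation, and the Python sketch below, with names of our own, simply sums many terms of each series.

```python
def partial_sum(term, n_terms=100_000):
    """Sum term(k) for k = 1..n_terms."""
    return sum(term(k) for k in range(1, n_terms + 1))

# probability that the second coordinate is 1 in the switched model, tends to 1/4
p_y2 = partial_sum(lambda k: 1.0 / (k * (k + 1) * (k + 2)))
# probability that the second and third coordinates are both 1
p_y2_y3 = 1.0 / 6.0
# probability of a 0 followed by a 1 at coordinates two and three, tends to 5/36
p_01 = partial_sum(lambda k: 1.0 / (k * (k + 2) * (k + 3)))
p_y3 = p_y2_y3 + p_01                     # tends to 11/36
product = p_y2 * p_y3                     # tends to 11/144, which is not 1/6
```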